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1  Introduction 

A  tremendous  amount  of  information  is  available  in  the 
contact  forces  of  manipulation.  Figure  1  shows  a  spec¬ 
trogram  of  an  impact  event.  The  impact  results  in  an 
increase  in  energy  at  all  frequencies  locally  around  the 
event,  and  a  persistent  residual  vibration  at  the  sensor’s 
natural  frequency.  This  signal,  like  all  contact  force  sig¬ 
nals,  can  be  broken  into  regions  that  are  similar.  We 
term  the  different  regions  contact  states  and  the  transi¬ 
tions  between  regions  contact  events. 

During  a  manipulation  task  like  grasping,  pushing,  or 
typing,  numerous  such  events  occur.  One  approach  to 
robot  programming,  for  any  of  these  tasks,  is  to  write  a 
program  component  that  detects  and  labels  each  event 
and  a  component  which  generates  an  appropriate  action 
based  on  the  sensed  event.  The  two  components  must 
mesh  because  each  provides  a  decision  or  knowledge  con¬ 
text  for  the  other.  The  current  action  selects  possible 
interpretations  of  sensor  signals,  and  the  changes  in  sen¬ 
sor  signals  guide  the  choice  of  new  actions.  In  general, 
most  robot  programming  has  used  an  action  centered 
paradigm.  That  is  the  programmer  first  determines  the 
sequence  of  actions  that  should  be  performed  and  then 
determines  how  to  use  the  available  sensors  to  guide  the 
actions.  The  guarded  move  is  the  typical  sensing/ action 
strategy  that  results.  Brock  [8]  provides  a  recent  study 
and  review  of  this  approach. 

This  paper  is  based  on  a  sensing  centered  program¬ 
ming  paradigm.  In  this  approach,  the  properties  char¬ 
acteristic  of  the  sensor  signals  alone  are  determined  and 
then  an  event  detector  for  changes  in  those  properties 
is  designed.  Ideally  these  properties  are  not  biased  by 
any  particular  task  model  and  therefore  are  useful  for  all 
tasks.  Because  the  approach  is  sensor  based,  the  types 
of  allowed  contexts  are  controlled  by  the  possible  inter¬ 
pretations  given  the  sensor  measurements.  The  actions 
are  then  designed  based  on  these  possible  contexts.  This 
paper  discusses  our  work  on  identifying  characteristic 
properties  of  contact  force  signals. 

In  manipulation,  force  and  position  events  guide  the 
task.  This  paper  focuses  on  force  signab  since  little  work 
has  been  done  on  interpreting  these  signals  during  com¬ 
mon  tasks.  Earlier  investigations  of  this  problem,  by 
other  researchers,  focused  primarily  on  designing  and 
proving  sensor  technologies  tailored  to  different  contact 
events.  In  contrast,  this  paper  presents  algorithms,  de¬ 
rived  from  the  theory  of  sequential  hypothesis  testing, 
that  are  designed  for  detecting  contact  events  and  label¬ 
ing  contact  states. 

Labeling  is  much  more  difficult  then  detecting  changes 
in  signal  characteristics.  In  general,  labeling  requires 
knowledge  about  the  possible  sources  of  the  signals  and 
the  source  characteristics.  This  research  looked  at  label¬ 
ing  some  simple  events  without  context. 

To  our  knowledge,  this  is  the  first  investigation  of  the 
discrimination  of  contact  events  and  the  first  application 
of  sequentird  decision  theory  to  this  problem.  Although 
we  applied  the  techniques  to  signals  from  an  intrinsic 
contact  sensor  [5],  the  ideas  can  be  applied  to  any  form 
of  tactile  sensor  to  enhance  performance.  Based  on  the 
experimental  results  in  this  paper,  we  are  investigating  a 


sensor  that  is  a  combination  of  the  intrinsic  contact  sen¬ 
sor  and  a  piezoelectric  film.  This  sensor,  which  is  similar 
in  design  to  [12],  has  better  dynamic  characteristics  and 
should  give  better  performance. 

In  the  following  sections,  we  first  summarize  the  work 
that  has  been  done  on  detecting  grasping  events  and 
on  human  models  of  temporal  tactile  sensing.  We  then 
briefly  discuss  our  experiments  and  the  signal  models. 
Then  sequential  hypothesis  testing  is  introduced  and 
used  to  derive  the  necessary  statistical  tests.  Next  our 
algorithm  is  presented  and  related  to  the  general  theory. 
We  then  present  the  experimental  results  of  our  algo¬ 
rithm,  discuss  its  theoretical  characteristics  and  compare 
it  to  other  techniques.  We  conclude  by  discussing  the  re¬ 
search  issues  in  contact  event  perception,  and  our  plans 
for  future  development. 

2  Previous  Work 

During  the  last  decade,  considerable  research  has  been 
performed  on  tactile  sensing.  Howe  [13]  provides  the 
most  recent  comprehensive  review  of  current  and  past 
research.  Most  of  this  research  has  focused  on  design¬ 
ing  surface  array  sensors  and  using  these  sensors  for  ob¬ 
taining  geometric  information  from  static  measurements. 
Some  research  has  looked  at  the  information  that  can 
be  acquired  by  actively  moving  the  contact  sensor  and 
monitoring  both  the  sensor  and  joint  locations.  This  is 
termed  haptic  sensing.  Primarily  prior  haptic  research 
has  focused  on  actively  tracing  the  contours  of  objects 
to  determine  geometry  and  critical  features  [7,  27).  This 
work  assumes  that  each  measurement  is  taken  with  the 
force  sensor  in  a  quasi-static  state  so  that  normal  forces 
and  contact  locations  can  be  computed.  All  of  this  work 
essentially  treats  the  tactile  array  sensor  as  a  primitive 
form  of  vision. 

In  contrast,  a  more  recent  type  of  contact  sensor  pro¬ 
cessing  is  the  detection  of  information  characteristic  of 
the  dynamic  aspects  of  motion  [13].  Mechanical  proper¬ 
ties  of  objects  like  mass,  friction,  and  damping  can  only 
be  determined  by  actively  probing  and  manipulating  the 
object.  Similarly,  the  initial  contact  with  an  object,  and 
slip  between  the  sensors  and  environment  require  detect¬ 
ing  relative  motion.  AH  of  these  sensing  modalities  are 
unique  to  contact  force  sensing. 

A  few  studies  have  been  done  on  this  type  of  sens¬ 
ing.  By  monitoring  the  acoustic  emission  from  a  metal 
gripper,  Dornfeld  [9]  was  able  to  detect  the  onset  of  slip 
for  some  metallic  workpieces.  Howe  and  Cutkowsky  [12] 
constructed  an  instrumented  latex  covered  finger.  Piezo¬ 
electric  sensors  are  embedded  in  the  latex  cover,  and  a 
miniature  accelerometer  is  mounted  on  the  inside  surface 
of  the  cover.  The  piezoelectric  sensors  are  very  sensitive 
to  strain  rate.  Because  of  the  small  mass  of  the  cover, 
the  accelerometers  are  sensitive  to  very  small  forces  nor¬ 
mal  to  the  surface  of  the  sensor.  They  found  that  the 
piezoelectric  sensor  was  very  sensitive  to  the  changes  in 
tangential  strain  associated  with  slip,  and  that  the  ac¬ 
celerometer  was  fairly  sensitive  to  the  vibrations  normal 
to  the  sensor  associated  with  slip. 

Bicchi  [6]  used  a  six-axis  fingertip  force-torque  sensor 
to  estimate  the  onset  of  slip  (figure  2).  This  sensor  has  a 


03 
Tini»(*ec) 


Figure  1:  Spectrogram  of  an  impact  event.  The  figure  shows  a  contour  plot  of  the  energy  in  frequencies  from  200-1350 
Hz  as  a  function  of  time.  The  signal  was  sampled  at  2700  Hz.  Sixty-four  points  windowed  with  a  Hamming  window 
were  used  for  each  fast  fourier  transform  (fft).  The  fft  was  computed  for  every  new  data  point.  Note  the  broad 
frequency  band  that  occurs  at  an  impact  and  the  short  time  scale  of  this  event. 
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Figure  2;  6^axis  fingertip  force-torque  sensor 


Maltese-cross  connecting  the  outer  shell  to  the  base.  The 
cross  is  instrumented  with  8  strain-gauge  half-bridges. 
The  shell  has  a  lightly  damped  natural  frequency  of  ap¬ 
proximately  690  Hz  when  the  base  is  fixed  and  the  shell 
free.  In  his  experiments,  Bicchi  first  determined  a  coef¬ 
ficient  of  friction  for  the  object  to  be  grasped.  Then,  by 
monitoring  the  ratio  of  the  tangential  force  to  the  normal 
force,  he  was  able  to  determine  when  the  contact  state 
was  approaching  the  slip  condition  determined  earlier. 

All  of  these  methods  generally  make  decisions  about 
the  contact  state  based  on  the  instantaneous  values  of 
the  measured  signals.  In  some  cases,  a  lowpass  filter 
may  be  introduced  to  reduce  the  “noise”  in  the  signal. 
One  exception  is  McCarragher  [20].  McCarragher  ex¬ 
amined  the  planar  assembly  process  and  constructed  a 
discrete  event  dynamic  system  controller  to  make  deci¬ 
sions  about  the  current  configuration  of  the  parts  based 
on  the  history  of  force  measurements.  He  used  quali¬ 


tative  reasoning  on  the  assembly  dynamics  to  construct 
interpretations  of  the  force  signal,  and  a  Petri  net  to 
provide  decision  context. 

In  contrast,  the  contribution  of  this  paper  is  to  show 
how  the  entire  history  of  the  signal  can  be  used  to  make 
decisions  in  a  statistically  robust  way  using  the  tech- 
niquies  of  sequential  decision  theory.  This  technique  is 
applicable  to  all  of  the  sensors  that  have  been  investi¬ 
gated  for  dynamic  contact  sensing.  In  our  experiments 
we  have  applied  it  to  the  6-axis  fingertip  force-torque 
sensor. 

In  this  paper,  only  the  individual  strain  gauge  sig¬ 
nals  produced  by  the  sensor  (figure  2)  are  sensed.  Elach 
signal  is  treated  as  independent  in  order  to  determine 
what  information  can  be  extracted  about  the  contact 
process  purely  from  the  characteristics  of  the  scalar  sig¬ 
nal.  Without  additional  measurements  or  prior  knowl¬ 
edge,  all  event  decisions  must  be  based  on  characteristics 
of  the  individual  strain  time  series.  We  test  the  signal  for 
whiteness,  changes  in  mean,  changes  in  vibration  level,  ** 
short  discontinuities  caused  by  impulses,  and  spectral 
structure.  The  tests  are  formulated  as  hypothesis  test¬ 
ing  problems  based  on  experimental  models  of  the  signal 
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3  Human  Capabilities 

Some  insights  on  the  design  of  robot  contact  sensing  al¬ 
gorithms  may  be  gained  by  a  study  of  human  temporal 
contact  sensing.  A  robot  sensing  through  an  intrinsic 
contact  sensor  or  a  single  piezoelectric  sensor  is  anal¬ 
ogous  to  human  exploration  using  a  stick.  The  stick 
encodes  the  distributed  contact  information  into  a  tem¬ 
poral  force  signal  which  is  transmitted  along  the  length 
of  the  stick.  A  surprising  amount  of  information  can 
be  gained  purely  through  this  channel.  Our  goal  in  de¬ 
signing  intrinsic  contact  sensors  and  algorithms  is  to  un¬ 
derstand  and  at  least  match  human  performance  in  this 
mode  of  exploration. 

Research  on  human  tactile  perception  has  shown  that 
there  are  four  types  of  mechanoreceptors  [14,  16,  15].  A 
fair  amount  is  known  about  the  frequency  response  of 
these  receptors  to  tightly  defined  inputs.  The  Merkel 
disks  and  the  Meissner  corpuscles  are  type  I  popula¬ 
tions  and  are  near  the  surface  and  closely  spaced.  These 
surface  sensors  are  primarily  sensitive  to  low  frequency 
information.  The  Merkel  disks  respond  to  static  and 
slow  changes  with  a  lowest  threshohi  at  about  10  Hz. 
The  Meissner  corpuscles  respond  only  to  changing  sig¬ 
nals  with  a  lowest  threshold  at  about  30  Hz.  Most  robot 
tactile  arrays  attempt  to  duplicate  the  spacing  and  ca¬ 
pabilities  of  these  two  sensors. 

The  deep  skin  sensors,  type  II,  are  more  widely  spaced 
with  a  density  of  20/cm^  and  a  10  mm  receptive  field. 
The  Pacianian  corpuscles  have  a  sensitive  region  from  50 
to  400  Hz.  The  Ruffini  organs  are  directionally  sensitive 
and  sense  skin  stretch. 

Johnson  [17]  provides  a  recent  comprehensive  review 
of  the  tactile  sensing  system  and  provides  a  working  hy¬ 
pothesis  for  the  role  played  by  three  of  the  four  basic 
systems:  the  SAI,  RA,  and  PC  systems.  The  SAI  sys¬ 
tem  is  the  slowly  adapting  type  I  population,  ending 
in  the  Merkel  disks^  and  all  the  pathways  conveying  its 
signals  to  memory  and  perception.  Similarly,  the  RA 
system  is  the  rapidly  adapting  type  I  population  end¬ 
ing  in  the  Meissner  corpuscles,  and  the  PC  system  is 
the  rapidly  adapting  type  II  population  ending  in  the 
Pacianian  corpuscles. 

The  SAI  system  is  thought  to  be  responsible  for  en¬ 
coding  low  frequency,  widely  separated,  spatial  contact 
signals.  Experiments  with  Braille  and  ^man  letters 
show  that  the  SAI  system  provides  an  isomorphic  im¬ 
age  of  the  contact  signal.  Therefore  the  SAI  system  is 
thought  to  be  responsible  for  directly  imaging  the  con¬ 
tact  shape. 

The  encoding  of  texture  is  still  being  determined.  It 
is  clear  that  the  perception  of  texture  requires  relative 
movement  between  the  skin  and  the  surface.  For  widely 
spaced  textures,  the  SAI  units  are  able  to  determine  the 
spatial  distribution.  However  for  very  fine  textures  and 
slip,  vibration  sensing  seems  to  be  critical.  Both  the  RA 
and  the  PC  systems  appear  to  be  involved. 

Recent  studies  have  shown  that  slip  is  detected,  for 
textured  surfaces,  by  these  rapidly  adapting  fibers  when 
a  vibration  is  set-up  by  the  relative  motion  of  a  tex¬ 
tured  surface  [26].  The  spatial  distribution  of  the  con¬ 
tact  is  of  only  -econdary  importance.  Relative  motion 


of  very  small  raised  dots  (4  /im  high,  550  fim  diameter 
single  dots,  and  1  pm  dots  spaced  at  100  pm  center-to- 
center)  on  a  smooth  plate  produced  reliable  activation 
of  the  fibers.  The  direction  of  slip  is  determined  by  the 
slowly  adapting  fibers  which  measure  skin  stretch.  This 
suggests  that  the  information  about  the  onset  of  slip  is 
carried  by  the  high  frequency  component  of  the  contact 
force,  and  that  the  direction  of  slip  is  curied  by  the  di¬ 
rection  of  skin  stretch. 

In  addition,  textured  surfaces  can  be  differentiated  by 
scanning  a  rigid  probe  across  a  surface.  The  probe  en¬ 
codes  only  temporal  information.  The  induced  vibration 
level  seems  to  be  used  for  discrimination.  The  PC  system 
is  the  most  likely  candidate  for  the  transduction  system, 
since  it  is  the  only  system  excited  by  the  impulses  that 
occur  during  the  initial  placement  of  an  object.  Neuro¬ 
physiological  study  of  this  mechanism  for  perception  has 
just  begun. 

In  summary,  the  SAI,  RA,  and  PC  seem  to  have  a 
separation  of  function  loosely  based  on  frequency.  This 
separation  is  used  in  perceiving  the  contact  state  while 
doing  manipulation  with  a  tool.  The  SAI  system  is  tuned 
to  determine  the  contact  distribution  between  the  hand 
and  the  tool.  The  PC  system  is  able  to  detect  the  high 
frequency  events  that  are  transmitted  down  the  tool  to 
the  hand  and  is  insensitive  to  the  shape  of  the  tool.  And 
the  RA  system  has  a  lower  spatial  sensitivity  then  the 
SAI  system  but  is  more  sensitive  to  vibration  and  so 
can  determine  local  movements  between  the  tool  and 
the  hand. 

This  work  suggests  that  it  is  possible  to  extract  infor¬ 
mation  about  texture  and  impacts  with  an  intrinsic  con¬ 
tact  sensor.  Like  the  PC  system,  the  algorithms  should 
look  for  high  frequency  events.  In  addition,  the  low  fre¬ 
quency  response  from  the  contact  sensor  should  be  re¬ 
lated  to  the  neural  encodings  for  joint  torques.  Finally, 
results  developed  for  the  temporal  response  from  a  sin¬ 
gle  contact  sensor,  may  be  extendable  to  analyzing  the 
temporal  response  from  a  sensor  array. 

4  Signal  Models 

In  order  to  detect  2md  label  different  events,  models  of 
the  different  signab  had  to  be  developed.  We  built  two 
pieces  of  experimental  apparatus  in  order  to  character¬ 
ize  the  response  of  the  system  to  impacts,  sliding  across 
a  surface,  no  contact,  and  grasping  contacts.  Statbti- 
cal  process  modeb  were  developed  based  on  the  experi¬ 
ments.  These  were  captured  as  a  set  of  signal  source  hy¬ 
potheses.  The  next  subsection  described  the  experiments 
and  the  signal  modeb,  and  the  last  subsection  dbcusses 
the  collection  of  the  modeb  into  source  hypotheses. 

4.1  Signal  Examples  and  Models 

We  considered  four  basic  contact  signab:  impacts,  slip, 
no  contact,  and  grasping  contacts.  For  each  basic  con¬ 
tact,  an  experiment  was  developed  to  isolate  that  par¬ 
ticular  event.  Based  on  these  experiments,  a  statistical 
signal  model  was  developed  by  testing  the  description 
against  the  data.  The  assumption  that  these  signab 
were  basic  and  could  be  used  to  span  the  contact  set 


Figure  3:  Impact  Test  Apparatus 


was  tested  by  considering  some  examples  of  more  com¬ 
plex  manipulation.  This  is  discussed  in  the  section  6. 

4.1.1  Impact  Model 

To  gather  impact  data,  a  solenoid  released  impact 
hammer  was  placed  on  a  knife  pivot  (figure  3).  The  eight 
strain  gauge  signals  resulting  from  the  hammer  striking 
the  fingertip  sensor  was  recorded  at  2.7  kHz.  The  im¬ 
pact  load  could  be  varied  by  changing  the  initial  height 
of  the  force  sensor  with  the  micrometer.  A  representative 
impact  signal  from  one  strain  gauge  and  its  correspond¬ 
ing  Fast  Fourier  Transform  (FFT)  is  shown  in  figure  4. 
There  are  a  number  of  important  characteristics  in  this 
signal.  First,  three  strikes  by  the  hammer  are  shown  by 
the  three  sharp  rises  in  the  figure.  Each  of  these  strikes 
is  separated  by  approximately  150  ms.  Second,  each  of 
the  sharp  impact  signals  decays  very  quickly  (approxi¬ 
mately  1.5  ms).  Finally,  the  sensor  continues  to  ring  for 
an  extensive  period,  over  0.5  sec,  after  the  impacts  (not 
shown).  Finally,  the  rapid  vibration  that  follows  each 
impact  events  is  at  approximately  1000  Hz. 

As  an  initial  model  of  the  impact  process  consider  the 
ideal  model  shown  in  figure  5.  In  this  model,  each  impact 
of  M2  with  Ml  is  an  inelastic  collision  with  coefficient  of 
restitution  e.  The  time  and  distance  of  the  collision  are 
small  compared  to  the  travel  time  and  distance  of  M2. 
Under  these  assumptions,  the  process  can  be  modeled 
as  a  series  of  impulses  applied  to  Mi.  Let  x,  be  the 
normalized  height  of  M2  after  collision  <  with  xo  =  1; 
wq  =  KiJMi  be  the  natural  frequency  of  the  sensor; 
Tf  =  uoy/2h/g  be  the  natural  unit  of  time,  with  h  being 
the  true  initial  height  of  M2',  U  be  the  time  of  the  ith 
collision  with  ti  =  ei;;  z  be  the  normalized  displacement 
of  xi;  =  M2/M1;  and  C  the  damping  coefficient  which 
is  less  than  1.  The  equations  of  motion  for  the  model 
are  then: 

Xi+i  =  e^Xi 

fi+i  =  ti  +  2ei;v^ 

^  j=l 

Using  linearity,  and  the  convolution  properties  of  6,  the 
trajectory  for  Mi  is 

z(t)  =  ^7,-  E  Vg7exp-<<*-*>)sin(v/rr^(f-f,)) 

C  ti<« 


Figure  4;  Representative  impact  signal  from  one  strain 
gauge  bridge.  There  are  128  samples  taken  at  2.7  kHz. 
The  FFT  was  computed  using  a  square  window.  The 
signal  mean  has  been  removed  from  the  FFT. 


M2 


Figure  5:  Ideal  model  of  the  impact  process.  An  inelastic 
collision  occurs  at  the  boundary  over  time  and  distance 
that  are  small  compared  to  the  travel  time  and  distance 
for  the  impacting  mass. 
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Figure  6:  Simulation  of  model  response  using  first  20 
impacts  with:  e  =  0.85,  C  =  0.1,  /?  =  1,  i/  =  5.5 

Simulations  of  this  model  result  in  some  of  the  phe¬ 
nomena  displayed  in  the  actual  data.  Figure  6  shows 
the  sudden  increase  in  displacement,  the  slow  exponen¬ 
tial  decay  in  the  signal,  and  the  discontinuous  effect  of 
secondary  impacts.  However,  the  simulation  does  not 
display  the  disorder  seen  m  the  actual  signal,  and  is  a 
poor  match  to  the  data.  A  more  detailed  model  for  this 
process  treats  the  process  as  an  initial  condition  response 
for  a  autoregressive  (AR)  model  of  unknown  order.  We 
were  able  to  get  good  fits  to  each  impact  event  with 
a  fourth  order  auto-regressive  model.  An  impact  event 
lasts  for  about  50  samples.  The  first  four  values  of  the 
impact  signal  were  used  to  create  the  initial  conditions 
and  the  parameters  of  the  autoregressive  model  were  es¬ 
timated  using  maximum  likelihood  (MLE)  over  the  re¬ 
maining  samples. 

The  majority  of  the  impact  signal  energy  is  in  the 
first  four  samples.  The  remaining  energy  is  in  the  ex¬ 
tended  ringing  that  follows  the  impact  event.  The  en¬ 
ergy  in  the  initial  part  is  captured  in  the  model  by  using 
the  first  four  values  as  initial  conditions.  The  ringing 
is  captured  by  the  autoregressive  coefficients.  Unfortu¬ 
nately,  the  MLE  estimator  is  nonlinear  and  uses  all  of 
the  data.  Therefore  the  estimates  have  to  be  generated 
through  numerical  search  which  make  the  model  imprac¬ 
tical  for  real-time  implementation.  Instead  we  adopted 
the  simpler  approach  of  computing  a  Krirhunen-Loeve 
(KL)  expansion  for  the  impact  shape  based  on  empiri¬ 
cally  segmented  training  data. 

Seventy-two  training  example  were  generated  using 
the  impact  apparatus.  The  start  of  the  impact  was 
found  by  looking  for  the  first  point  in  the  signal  that  was 
twenty  or  more  standard  deviations  away  from  the  cal¬ 
ibrated  initial  conditions.  Forty  points  were  then  taken 
from  this  starting  point  to  get  seventy-two  examples  of 
length  forty.  The  mean  was  removed  from  each  sam¬ 
ple  individually,  and  then  each  sample  was  normalized 
to  have  unit  energy.  Normalization  prevented  any  one 
signal  from  dominating  the  result  and  since  the  test  is 
a  ratio  between  different  hypotheses  the  energy  is  not 
important.  The  Karhunen-Loeve  expansion,  which  is  an 
eigenvalue  expansion  of  the  sample  covariance,  was  then 
computed.  The  majority  of  the  energy  was  contained 
in  the  first  four  eigenvalues;  therefore,  these  were  made 
additional  features  for  the  impact  model.  The  features 
were  extended  beyond  forty  samples  by  adding  zeros  to 


Figure  7:  Slip  Test  Apparatus 

the  end  of  the  feature  signal.  The  residual  after  fitting 
the  mean  and  four  features  was  modeled  as  white  normal 
noise.  The  resulting  impact  model  for  the  measurement 
y(f)  is 

4 

i=l 

c(t)  ~  ^{0,V).  (2) 

In  this  model,  the  measurement,  y,  is  a  mezm  fj  plus  a 
linear  combination  of  four  features  st ,  which  are  treated 
as  being  orthogonal  to  /j,  and  a  normal  process  noise 
e(t).  The  shape  of  the  impact  signal  and  its  residual 
vibration  is  captured  by  Sk,  the  new  mean  is  captured 
by  fi.  In  building  this  model,  the  implicit  assumption  is 
that  the  shape  of  the  impacts  is  approximately  the  same. 
This  turned  out  to  be  true  only  in  the  ideal  case. 

4.1.2  SUp  Model 

Slip  data  was  collected  with  the  experimental  appa¬ 
ratus  shown  in  figure  7.  The  sensor  was  pulled  along 
the  base  with  constant  normal  load.  The  load  could  be 
changed  by  adding  weights  to  the  sensor  mount.  Pieces 
of  sandpaper  were  used  to  control  the  surface  roughness. 
These  were  placed  part  way  along  the  base  of  the  ap¬ 
paratus.  A  section  of  the  signal  generated  from  passing 
over  the  sandpaper  is  shown  in  figure  8.  The  FFT  shows 
that  the  spectrum  is  essentially  flat  out  to  the  bandwidth 
of  the  sensor  at  which  point  it  rapidly  decays.  An  ap¬ 
proximate  model  is  then  given  by  a  normal  white  noise 
process 

y(0  ~  iV(p,  V). 

This  model  was  also  confirmed  by  a  histogram  and  the 
autocorrelation  function. 

4.1.3  No  Contact  Model 

A  sample  of  the  strain  gauge  signal  without  any  con¬ 
tact  is  shown  in  figure  9.  Although  the  sensor  is  not 
being  contacted,  background  vibrations  are  picked-up 
through  the  base.  In  this  mode,  the  sensor  is  acting  as  an 
accelerometer.  The  FFT  of  the  signal  is  again  fairly  flat, 
and  the  autocorrelation  also  indicates  that  the  signal  is 
approximately  white.  For  small  vibrations,  quantization 
noise  would  be  the  dominant  error  which  has  a  uniform 
distribution.  However,  the  vibration  levels  are  such  that 
a  normal  model  is  a  better  fit  to  the  data  as  indicated 
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Figure  8:  Representative  section  of  signal  generated  by 
sliding  over  a  piece  of  sandpaper.  512  samples  taken 
at  2.7  kHz  and  the  corresponding  FFT  using  a  square 
window.  The  signal  is  fairly  flat  out  to  the  bandwidth 
of  the  sensor. 


Figure  9:  Representative  rest  signal  from  one  strain 
gauge  bridge  sampled  at  2.7  kHz  and  the  correspond¬ 
ing  FFT  using  a  square  window. 


by  the  histogram.  Therefore  an  appropriate  model  of  the 
signal  is 


y{t)  ^  N{^io,Vo). 

This  model  is  distinguished  by  the  particular  values  of 
Ho  and  Vb-  To  recognize  this  model  these  values  must  be 
calibrated. 

4.1.4  Grasping  Contact 

Finally,  the  fingertip  sensor  was  grasped  by  hand  and 
forces  were  applied  (figure  10).  A  slip  free  grasp  was 
maintained  by  relying  on  human  capabilities  for  detect¬ 
ing  slip.  The  FFT  of  the  signal  shows  a  lowpass  response, 
and  an  appropriate  model  is  an  autoregressive  process 

p 

J=i 

e(t)  ~  JV(0,V). 

4.2  Hypothesis  Models 

The  goal  of  this  work  is  to  segment  any  strain  signal 
into  pieces  which  can  be  identified  with  one  of  the  experi¬ 
mentally  selected  models.  Segmentation  can  be  achieved 
by  designing  a  decision  algorithm  which  is  tuned  to  our 
model  set.  Therefore  we  collected  the  parametric  models 


GntpafSiiaal 


Figure  10:  Representative  grasp  signal.  512  samples  at 
2.7  kHz  and  the  corresponding  FFT  using  a  square  win¬ 
dow.  The  fingertip  sensor  was  clamped  in  place  and  then 
loads  were  applied  by  grasping  and  twisting  the  sensor 
housing. 
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of  the  signal  into  six  basic  signal  'sources,  or  hypotheses: 
NuU  State  (HO):  y{i)  ~  Nino,  Vo) 

New  Mean  (HI):  y(0  —  N{ft,  Vq) 

New  Noise  Level  (H2):  y(l)  ~  7V(/io,  V)) 

New  Mean  and 

Noise  Level  (H3):  y(t)  ~  N{ni,Vi) 

Impact  Signal  (H4):  y(0  = 

e(/)~7v(0TVx) 

Grasping  Signal  (H5):  y(t)  =  a,y(t  -  j)  +  e(t) 
t{t)  ~  n(0,Vi). 

This  set  of  source  can  be  viewed  as  asking  a  set  of 
questions  about  the  statistics  and  spectral  properties  of 
the  strain  signal  : 

1 .  Is  the  signal  white  or  is  there  significant  correlation? 

2.  Has  the  mean  of  the  driving  noise  changed? 

3.  Has  the  variance  of  the  driving  noise  changed? 

4.  Is  the  mean  and/or  variance  from  the  base  case  of 
no  contact? 

5.  Are  there  any  impact  signatures  in  the  time  series? 

A  segmentation  and  identification  procedure  was  de¬ 
rived  based  on  these  parametric  models.  The  procedure 
is  based  on  the  generalized  likelihood  ratio  test  coupled 
with  the  minimum  description  length  principle.  This  ap¬ 
proach  offers  a  number  of  advantages  over  more  adhoc 
procedures.  First,  its  model  based.  The  decision  proce¬ 
dure  follows  directly  from  the  models  described  above. 
Second,  it  is  an  optimal  procedure,  within  the  context  of 
the  models,  when  such  a  procedure  exists.  Lastly,  it  ex¬ 
plicitly  estimates  the  time  of  events,  which  is  a  property 
that  most  filter  based  approaches  lack. 

The  next  section  5  presents  the  theoretic  basis  for 
our  decision  algorithm.  It  begins  with  a  slightly  more 
abstract  form  of  the  problem  of  statistical  segmenta¬ 
tion  and  identification  in  order  to  frame  the  discussion. 
The  implicit  assumptions  about  the  measurement  pro¬ 
cess  used  in  the  generalized  likelihood  ratio  are  then  jus¬ 
tified.  The  test  is  then  presented  with  a  sequence  of 
examples  each  of  which  is  interesting  in  its  own.  For 
example,  the  procedure  for  testing  for  a  change  in  mean 
from  a  known  value  is  the  optimal  guarded  move  sensing 
strategy. 

Finally  section  5.4  discusses  the  problem  of  labeling 
multiple  parameterized  models  which  is  the  problem  pre¬ 
sented  by  our  set  of  hypotheses.  In  this  case,  the  problem 
of  uniformly  penalizing  the  free  parameters  arises.  We 
discuss  the  two  basic  approaches  that  have  been  used  in 
the  literature  and  present  a  justification  for  the  choice 
of  the  minimum  description  length  principle  (MDL).  Fi¬ 
nally,  the  theory  section  concludes  by  presenting  the  al¬ 
gorithm  in  detailed  form  including  a  discussion  of  effi¬ 
cient  parameter  estimation  algorithms.  Results  are  pre¬ 
sented  in  section  6.  We  applied  the  algorithm  to  both  the 
training  examples  discussed  above  and  on  more  general 
tasks. 

5  Sequential  Hypothesis  Testing 

In  order  to  develop  algorithms  for  processing  dynamic 
contact  information,  we  introduce  a  sequential  hypoth¬ 
esis  testing  model  of  the  sensing  process.  This  area  has 


been  an  active  area  of  research  in  statistics  and  signal 
processing  since  its  initial  development  by  Wald  [28].  A 
mathematical  review  is  given  by  Siegmund  [25].  There 
have  been  a  number  of  important  results  during  the  last 
decade  [3,  29].  These  methods  are  relevant  to  any  sig¬ 
nal  processing  task  which  can  be  modeled  as  a  stochas¬ 
tic  measurement  process  on  an  underlying  system  which 
undergoes  discontinuous  changes.  The  methods  are  par¬ 
ticularly  useful  when  accurate  imd  rapid  decisions  about 
the  time  of  change  are  required.  This  includes  edge  de¬ 
tection,  continuous  speech  segmentation,  and  dynamic 
contact  sensing. 

The  most  powerful  hypothesis  testing  procedure,  i.e. 
the  one  that  uses  the  most  prior  information,  is  Bayesian 
decision  theory  on  a  semi-Markov  chain.  An  approach 
using  this  model  would  estimate  the  probability  of  be¬ 
ing  in  every  node  in  tne  chain  and  would  consider  every 
possible  sequence  of  transitions  on  the  graph  to  explain 
the  sequence  of  measurements.  This  can  be  computa¬ 
tionally  complex.  In  addition,  this  approeub  requires  a 
prior  probability  distribution  for  the  transition  probabil¬ 
ities,  the  holding  times,  and  any  parameter  values  which 
are  difficult  to  develop.  An  alternative  procedure  is  the 
sequential  likelihood  ratio. 

In  sequential  hypothesis  testing  it  is  assumed  that  the 
time  for  the  algorithm  to  detect  a  transition  is  short 
compared  to  the  holding  time  before  a  second  transi¬ 
tion.  Therefore  it  is  assumed:  1)  that  transitions  can 
be  detected  by  considering  only  the  data,  2)  that  at  any 
time  only  one  hypothesis  needs  to  be  assumed  to  be  true, 
and  3)  only  one  transition  from  this  hypothesis  needs  to 
be  considered.  These  assumptions  make  the  problem 
complexity  at  most  linear  in  the  number  of  samples. 

The  next  subsection  discusses  the  sequential  hypothe¬ 
sis  testing  approach  in  general.  Then,  a  set  of  important 
special  cases  is  presented.  First,  the  simple  hypothesis 
testing  problem  of  testing  between  known  distribution 
is  presented.  This  problem  arises  often  in  practice  and 
is  the  easiest  to  treat  theoretically.  As  an  example,  we 
show  how  easily  contacts  can  be  detected  optimally,  and 
demonstrate  the  increase  in  performance  over  an  opti¬ 
mally  designed  filter  and  threshold  approach.  Then  the 
procedure  for  two  known  signals  in  Gaussian  noise  is 
presented.  The  form  of  the  computations  for  this  test 
appears  in  all  the  more  complex  algorithms. 

Next  we  consider  changes  between  two  parameterized 
distributions.  As  a  special  case  we  consider  testing  for 
an  unknown  change  in  mean.  This  problem  is  very  im¬ 
portant.  Many  problems  can  be  reduced  to  this  problem 
by  appropriate  preprocessing.  In  addition,  all  estimation 
procedures  give  rise  to  an  asymptotically  local  procedure 
which  locks  like  a  test  for  change  in  mean  [4]. 

Finally,  we  consider  the  problem  of  multiple  param¬ 
eterized  distributions.  This  is  the  form  of  our  dynamic 
sensing  algorithm  takes.  The  procedure  requires  a  tech¬ 
nique  for  uniformly  penalizing  the  number  of  free  pa¬ 
rameters,  and  we  adopt  the  minimum  description  length 
principle.  This  section  concludes  by  presenting  the  algo¬ 
rithm  in  detail  including  a  discussion  of  efficient  param¬ 
eter  and  model  order  estimation. 
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5.1  General  Theory 

In  general,  the  meaaurement  could  be  caused  by  any  one 
of  a  set  of  m  hypotheses  (states)  Ti  =  {Hi}.  Each  state 
provides  a  statistical  description  Pi(y(^)<  ••■>y(0) 
measurement  process.  y(i)  is  the  first  p'insor  signal  gen¬ 
erated  from  state  t  and  y(ib)  is  the  last.  We  consider 
only  discrete  measurements.  As  a  series  of  sensor  mea¬ 
surements  yj  =  {y(0), ...,  y(n)}  are  taken,  the  problem  is 
to  determine  the  sequence  of  states  Zg  =  {z(0), ...,  z(n)} 
from  which  the  measurements  were  t^en.  Since  it  is  as¬ 
sumed  that  the  time  between  events  is  sufficiently  long 
for  an  algorithm  to  detect  the  transition,  the  algorithm 
can  run  forward  in  time,  from  the  last  detected  event 
and  state  and  look  only  for  a  single  transition  within  the 
data  under  consideration. 

Initially  the  algorithm  is  given  that  hypothesis  Hp  is 
true.  Then,  for  every  time  r  from  0  to  n  the  likelihood  ' 
that  the  measurements  were  generated  by  Hp  from  time 
0  to  time  r  —  1  and  then  by  a  different  state  Hy  from  time 
r  to  n  is  computed.  This  is  compared  to  the  assumption 
that  all  the  data  came  from  Hp.  Because  of  the  indepen¬ 
dence  assumption  of  state  transitions  and  the  measure¬ 
ment  densities,  the  likelihood  of  hypothesis  p  followed  by 
hypothesis  g  is 

Mp,9,r,yo)  =  Pp(y(»’  - 1) . y(0))Pf(y(n).-,y(»-)). 

and  the  likelihood  that  all  of  the  measurements  came 
from  state  p  is 

i(p.yS)  =  Pp(y(n).  •.y(0)). 

The  optimal  test  is  the  likelihood  ratio  for  the  measure¬ 
ments 

Mp,9,r,s/S)  _  Pp(y(>‘  - 1) . y(0))Pt(y(n).  -.y(>‘)) 

^p,yo)  Pp(y(n),...,y(0)) 

The  decision  function  is  the  maximum  of  the  log  of  this 
ratio  over  the  possible  new  states 


£>^’(P.9.yo) 


max  log 

relOpfi) 


Up,<l>r,yS) 


The  most  likely  new  state  is  q  which  equals 


g=:argmax£>F(p,9,j/J). 
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This  yields  the  test 

DF{p,qX)  I  Tl 

H, 

This  rule  says  that  H ^  will  be  chosen  as  the  new  state  if 
DF{p,q,^)  becomes  larger  than  1^,  otherwise  Hp  will 
be  maintained  as  the  current  hypothesis.  T3  is  the  de¬ 
cision  threshold  and  is  a  design  parameter  that  controls 
the  essentia]  trade-off  between  the  two  types  of  errors. 

*The  likelihood  is  the  conditional  probability  of  receiv¬ 
ing  the  measurements  given  the  hypothesis.  It  is  related  to 
the  probability  that  the  hypothesis  is  true  through  Bayes 
rule  which  required  a  prior  distribution.  The  likelihood  is 
used  when  the  prior  distributions  are  either  unavailable,  or 
assumed  to  be  noninformative. 


There  are  two  important  characteristics  of  this  test 
1)  the  false  alarm  rate,  2)  the  delay  to  detection.  The 
earliest  time  at  which  the  decision  function  exceeds  the 
threshold,  given  that  the  system  is  still  in  state  p,  is 
the  false  alarm  time  t/(p)  =  inf(n  ;  DF(p,q,\/^)  >  T^) 
which  has  distribution  Prf<4(n,p).  The  probability  of 
no  alarm  at  time  n  is  Ptfiijt(n,p)  =  1  -  PrFA(n,p) 
The  asymptotic  false  alarm  rate  is  defined  to  be  /(p)  = 

1  —  linu->oo  reflects  the  rate  at  which 

false  alarms  will  occur  over  the  long-term.  In  contrast, 
the  delay  to  detection  is  a  transient  performance  mea¬ 
sure.  The  delay  to  detection,  given  that  a  change  to  state 
q  occurred  at  time  0,  and  the  decision  function  is  on  state 
g,  is  tD(p,g)  =  inf(n  :  DF(p,q,iiS)  >  T^\x(t)  =  H,). 
The  distribution  of  tD{p,g)  is  Prx>(n,p,  9)  and  its  ex¬ 
pected  value  is  t*D(p.?)  =  Ci^gi  Ptoft.P.q)  Both 
statistics  are  controlled  by  Tp  which  is  a  design  param¬ 
eter.  Increasing  Tp  decreases  the  false  alarm  rate  and 
increases  the  time  to  detection.  Determing  both  of  these 
relationships  requires  solving  a  first  passage  problem. 
Closed  form  solutions  to  this  type  of  problem  are  rare 
and  difficult  to  derive.  However,  for  some  of  the  spe¬ 
cial  cases  considered  below  approximations  can  be  de¬ 
termined  and  are  presented. 


5.2  Changes  Between  Two  Known  Distributions 

The  simplest  and  most  important  special  case  is  detect¬ 
ing  changes  between  two  known,  conditionally  indepen¬ 
dent,  probability  distributions  for  a  signal.  This  prob¬ 
lem  contains  all  of  the  essential  features  of  the  statistical 
decision  procedure.  More  generally,  many  sequential  de¬ 
cision  problems  can  be  treated  by  designing  a  prefilter 
which  changes  the  problem  into  this  binary  testing  prob¬ 
lem. 

Assume  that  y(()  is  an  independent  sequence  with  dis¬ 
tribution  po(y(f))  under  hypothesis  0  and  pi(i/(0)  under 
hypothesis  1.  Further,  assume  that  Bo  is  the  initial  hy¬ 
pothesis.  Because  y(t)  is  independent,  the  probability 
density  of  receiving  a  sequence  of  measurements  un¬ 
der  either  hypothesis,  conditioned  on  the  value  of  H,  is: 


p(yilH.)  = 

l=k 


The  likelihood  ratio  between  the  two  hypotheses  is 


-C(0, 


P(»5~MHo)p(py|H,) 

P(!^|Ho) 

rrEiMll 

Po(y(0) 


(3) 

(4) 


To  simplify  the  calculations  let  70(f)  =  log(Po(y(t))). 
7i(f)  -  log(pi(y(f))),  Hoi(f)  =  7i(t)  -  7o(0> 
5i(o,i)  =  EU  Hoi(f).  Then  the  decision  function  for 
a  change  from  state  0  to  state  1  is 

DF(0,l,yJ)=  inax  log£(0,  l,r,y5) 

••elo.n] 

which  results  in  the  binary  rule 

H, 

DF(0,1,pS)  >  T^.  (5) 

Ho 
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Figure  11:  Behavior  of  the  Page-Hinkley  stopping  rule 
to  a  simulated  change  in  mean  at  tick  126  for  a  Gaus¬ 
sian  process.  Signal  has  standard  deviation  of  1  before 
and  after  the  change,  and  mean  of  1.0  after  the  change. 
Change  is  detected  with  a  threshold  of  15  at  tick  149. 
The  estimate  of  the  time  of  change  is  the  last  time  th'^ 
test  equals  zero  which  is  at  tick  128. 

This  is  equivalent  to  the  Page-Hinkley  (PH)  cumulative 
sum  stopping  test 

OF(0,l,yS)  =  5J(0,1)-  min  55(0,1). 

0</<n 

This  test  minimizes  the  time  taken  to  reach  decision 
over  all  tests  that  have  the  same  false  alarm  rate  [25]. 
Further,  it  is  easily  computed  recursively  by 

DF{Q,  l,y5)  =  max(0,£>F(0, 1,  j^“')  -(- Hoi(n)). 

5.2.1  Change  in  Mean  in  Gaussian  Noise 

An  application  of  the  PH  test  to  a  change  in  mean 
in  Gaussian  noise  is  shown  in  figure  11.  This  figure  was 
generated  by  adding  a  mean  of  1.0  to  a  zero  mean  Gaus¬ 
sian  random  sequence  with  variance  I.  With  a  change 
threshold  of  15,  the  test  detects  the  change  at  tick  149 
for  a  delay  of  24  ticks.  The  change  time  is  estimated  to 
be  128.  In  this  particular  case 

=.oi(n)  =  - JT - (y(n) - - — ) 

where  V  is  the  variance  of  the  signal,  and  m  are  the 
means  under  the  two  hypotheses.  DF  is  simply  the  cu¬ 
mulative  sum  or  integration,  of  Hoi  with  resetting  at  0. 
The  computation  has  the  same  computational  cost  as 
the  alternative  of  lowpass  filtering  the  signal  and  thresh¬ 
olding  the  result  at  the  optimal  value  ft. 

5.2.2  Comparison  to  the  filtering  approach 
The  performance  of  the  PH  test  can  be  compared  to 

lowpass  filtering  followed  by  thresholding  on  tests  with 


Figure  12:  Receiver  operating  characteristic  (ROC)  of 
Page-Hinkley  test  between  two  Gaussian  distributions 
with  different  means  and  the  same  variance  as  a  function 
of  the  signal  to  ndse  ratio  s  =  ^.  The  logio(fy)  is 
shown  as  a  function  of  the  mean  time  to  detection  tj  for 
s  =0.5,  1.0,  1.5,  and  2.0. 

equivalent  false  alarm  rates.  The  asymptotic  false  alarm 
rate  and  time  to  detection  for  the  Page- 

Hinkley  test  can  be  approximated  by  applying  Wald’s 
identity  and  approximations  [25].  The  results  are 

»  |e^’ -  T*  -  l|//?o 

where 

Since  the  false  alarms  are  the  interarrival  times  of  a 
Bernoulli  process  they  are  geometrically  distributed. 
Therefore  the  asymptotic  false  alarm  rate  is 

PH  f  _  ^ 

J  -  PHip 

For  the  change  in  mean  between  two  Gaussian  processes 
with  the  same  standard  deviations  a,  0i  is 


A  plot  of  the  trade-off  between  the  time  to  detection, 
td,  and  the  time  to  false  alarm,f/  is  called  the  receiver 
operating  characteristic  (ROC).  It  is  a  function  of  the 
signal-to-noise  ratio  s  =  ^ .  Graph  12  shows  the  value 
of  fd  and  logjof/  parameterized  by  T  for  a  fixed  value 
of  s.  The  ROC  for  this  test  is  shown  in  figure  12  for 
s  =  0.5,  ^  .0, 1.5, 2.0.  Both  the  mean  time  to  a  false  alarm 
and  deci. 'on  increase  with  increasing  threshold.  At  a 
fixed  false  alarm  time,  an  increase  in  the  signal  to  noise 
ratio  will  decrease  the  time  to  detection. 

The  performance  of  the  alternative  test  of  lowpass  fil¬ 
tering  followed  by  thresholding  can  be  bounded  using  the 


I 


ROC  fct  LmiMH  fte  Twi 


following  asymptotic  approximation  derived  by  Ball  [10]. 
The  approximations  are  valid  in  the  limit  of  an  increas¬ 
ing  threshold  and  short  sampling  time.  Consider  a  filter 
resized  by  a  stable,  linear,  time  invariant  vector  process 
z 


x(k  +  1)  =  Az(k)  ■+  w(k  +  1)  +  Apu_i(i:  -  r) 


driven  by  a  white,  zero-mean,  Gaussian  noise  w(k)  with 
noise  intensity  Q.  A  change  of  size  A/i  is  applied  by  the 
unit  step  u-i  at  time  r.  The  covariance  of  z  satisfies 
the  discrete  Lyapunov  equation  S  =  ASA^  +  Q  and  the 
decision  function  is  DF{k)  =  x'^(k)S~^x{k)  In  princi¬ 
ple  it  is  possible  to  determine  Pr^y|(i;)  by  propagating 
the  density  for  z{k),  p(z,  k),  forward  in  time  and  then 
integrating  over  the  decision  region.  The  propagation 
equation  is 

p(x,lr+l)=  f  p„(z-A()p((,k)d( 

Jd 


where  D  =  {x  :  DF(k)  <  T^}.  Then  PifA(k)  is  given 
b,  - 

Pr(i:)  =  1  -  /  p(u,b)du. 

^  Jd 

Unfortunately  there  are  no  closed  form  solutions  to  this 
problem.  However  by  treating  the  discrete  system  as  a 
sampling  of  a  continuous  system,  an  approximation  valid 
for  large  k  can  be  determined.  Using  this  approximation, 
the  steady  state  false  alarm  rate  /  is  found  to  be  asymp¬ 
totically  bounded  by 


where  p  is  the  dimension  of  x.  In  the  case  of  a  first-order 
lag  filter  x{k  4-  1)  =  ax(Jfc)  +  u;(ib),  the  bound  is 

/o  <  1  -  exp  ^\/ir/2 ln(a)T exp“^’''^(l  -  1/T^)j  . 


This  is  the  bound  for  i*/S  >  T.  The  PH  test  is  equiva¬ 
lent  to  >  T  which  has  a  false-alarm  rate  bounded 

by  /o/2. 

To  approximate  PrD(ifc)  note  that  DF(k)  is  a 
noncentral  chi-squared  random  variable  with  p  de¬ 
grees  of  freedom  and  noncentrality  parameter  6^{k)  = 
x^{k)S~^x(k)  [2].  The  process  mean  x  satisfies 

x(ib  -f  1)  =  Ajz{k)  -P  Ap 

with  initial  condition  x(0)  =  0  for  a  change  in  mean 
of  A/i,  where  we  have  assumed  for  simplicity  r  =  0. 
If  the  cumulative  noncentral  Chi-square  distribution  of 
DF  at  value  T*  is  denoted  by  F(7’*,  S^,p),  then  PrD(ib) 
is  bounded  by 

Pr(*)>  l-F(T^6^p) 


which  can  be  computed  numerically  or  approximated. 

For  a  scalar,  first-order  lag-filter,  the  ROC  can  be 
computed  as  a  function  of  the  the  signal-to-noise  ratio 
s  as  in  the  PH  test.  In  this  case,  the  values  of  tj  and 
logjo  are  parameterized  by  a.  The  optimal  threshold 

for  the  test  is  where  5  =  This  gives  a 


Figure  13;  Receiver  operating  characteristic  (ROC)  of 
first  order  lag  filter  test  with  threshold  between  two 
Gaussian  distributions  with  different  means  and  the 
same  variance  as  a  function  of  the  signal  to  noise  ra¬ 
tio  s  =  The  logjo(i^^)  is  shown  as  a  function  of  the 
mean  time  to  detection 


threshold  of  =  (5)*  With  the  one-sided  test. 


an  approximation  for  Pr/>(b)  is  simply  the  probability  of 
drawing  a  value  greater  that  Ap/2  from  a  Gauss<an  ran¬ 
dom  sample  which  has  mean  x(ib)  and  variance  5,  given 
that  the  test  has  not  already  terminated.  The  probabil¬ 
ity  of  terminating  at  time  k  given  that  the  test  has  not 
already  terminated  is 


F[k)  =  1  -  erf 


The  probability  of  terminating  at  time  1:  is  then  given 
by  the  recursion 


Pd(0)  =  F{0) 

Poik)  =  F{k){l-PD{k-l)). 
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This  gives  an  underestimate  of  the  termination  time.  An 
overestimate  is  given  by  the  rise  time  for  x(ib)  to  A/i/2. 
Figure  13  shows  the  logarithm  of  </  as  a  function  of 
ta  for  a  signal-to-noise  ratio  of  s  =  0.5,  1.0,  1.5,  and  2 
computed  using  these  two  approximations.  The  curve  for 

5  =  0.5  has  been  cut  short,  because  the  approximation 
is  not  valid  for  small  ij. 

Figure  14  indicates  that  the  lowpass  filter  approach 
has  a  longer  delay  to  detection  compared  to  the  PH  test 
when  they  have  the  same  false  alarm  rate.  The  test 
shown  in  figure  11  will  signal  an  alarm  on  average  every 

6  X  10^  samples  and  the  change  will  be  detected  after  28 
saunples.  To  get  equivalent  performance  from  the  low- 
pass  filter,  a  must  equal  0.98.  With  this  value,  the  esti¬ 
mate  of  Id  is  29.5  and  the  rise  time  is  34.5.  These  results 
demonstrate  that  the  PH  test  gives  an  improvement  in 
performance  without  an  increase  in  computational  cost. 
In  addition,  an  estimate  of  the  change  time  is  possible 
with  a  small  amount  of  additional  storage. 


Figure  14:  Lowpass  filter  of  x{n  +  1)  =  0.98i:(n)  + 
0.02y(n+l)  on  the  same  signal  as  figure  4.  The  threshold 
is  0.50.  The  change  is  detected  at  153.  This  is  a  slower 
response  then  the  response  for  the  PH  test.  Further,  an 
estimate  of  the  change  time  is  not  computed. 

5.2.3  Change  between  Deterministic  Signals  in 
Additive  Gaussian  Noise 
The  sequential  hypothesis  approach  can  be  extended 
to  derive  tests  for  characteristics  that  are  detectable  with 
a  single  linear  filter.  A  direct  extension  to  the  binary 
problem  is  given  by  the  hypotheses: 

Ho  :  y(t)  =  so(t)  +  uo(t)  t  =  0,...,n 

Hi:  y{t)  =  si(t)  +  vi(t) 

where  s,  are  known  deterministic  signals.  A  test  equiv¬ 
alent  to  the  binary  case  results  with  71  (t)  redefined  as 

7,(t,  r)  =  log(p„_,(y(t)  -  s,(t  -  r))) 

where  p„  ^  are  the  known  probability  densities  for  the 
noise  process.  In  this  case,  DF(Q,  l,yo)  cannot  be  com¬ 
puted  recursively.  Instead,  the  maximization  must  be 
computed  by  exhaustive  search  over  a  growing  window 
of  length  n.  To  limit  the  increase  in  computational  com¬ 
plexity,  a  suboptimal  approach  is  usually  taken  where 
the  search  window  is  constrained  to  a  moving  window  of 
fixed  length. 

Because  of  the  time  invariance  of  s,  ,  the  complexity 
of  the  calculation  of  S|?(0, 1)  is  order  Lm  where  L  is 
the  number  of  points  in  the  window  and  m  is  the  cost 
of  computing  soi(n,r)  for  a  single  point.  To  see  this, 
suppose  that  5J?(0, 1)  has  brnn  computed  over  the  win¬ 
dow  and  stored  in  a  vector  5oi(n)  of  size  L.  When  data 
point  j/(n  -t-  1)  arrives  2oi(l,r)  must  be  computed  for 
every  r  and  t  in  the  window  and  then  summed.  How¬ 
ever,  the  sum  of  Hoi(t,r)  for  r  and  t  less  than  n  1 
has  already  been  computed  and  stored  in  5ot.  Thus 
by  shifting  5oi(n)  by  one  and  then  adding  Soi(n  -f  1,  r) 
to  the  shifted  result  we  obtain  5(n  +  1).  This  requires 
order  Lm  operations.  With  a  vector  processor  with  L 
elements,  the  entire  calculation  can  be  done  in  parallel 


in  m  operations.  5oi(n  -F  1)  must  then  be  searched  for 
the  maximum  element.  This  structure  for  the  calcula¬ 
tions  is  preserved  in  the  more  general  case  of  an  unknown 
change  magnitude  in  Gaussian  noise,  which  is  considered 
in  the  next  section. 

5.3  Unknown  Parameterized  Distributions 

In  most  applications  the  magnitude  of  the  change  is  un¬ 
known.  For  this  problem,  there  are  two  probability  mod¬ 
els  for  y(<) 

Ho:  y{t)  Po(y(0.®o) 

Hi:  y(t)  Pi(y(0,^i) 

where  00  is  a  known  parameter  vector  and  0]  is  unknown. 

00  is  an  element  of  ,  0i  is  element  of  ,  and  y 
is  an  element  of  37'’.  The  probability  densities  po  and 
P]  may  or  may  not  be  from  the  same  family.  If  the 
dimension  and  interpretation  of  0o  and  0i  are  the  same, 
one  approach  for  detecting  the  change  is  to  choose  a 
minimum  change  value  00  >  0  a  priori  and  then  run 
two  Page-Hinkley  tests  in  parallel  with  0i  =  0o  ±  00. 
The  test  which  terminates  first  is  used  as  the  decision 
rule.  Changes  of  parameter  greater  than  00  will  trigger 
a  decision,  but  the  delay  to  decision  will  be  longer  than 
with  the  correct  00. 

One  approach  to  choosing  a  value  for  00  is  to  select  a 
distribution  for  0]  a  priori .  In  this  case  the  measurement 
distribution,  under  Hi ,  is  the  convolution 

Pi(y(0IHi)  =  Pi(y(0.^i)®p«,- 

If  01  is  treated  as  a  sequence  of  random  variables  drawn 
from  pj, ,  then  the  Page-Hinkley  test  can  be  applied  as 
above. 

The  problem  of  choosing  a  distribution  for  0i  can  be 
eliminated  by  maximizing  C  over  both  0i  and  r.  This 
yields  the  generalized  likelihood  ratio  (GLR)  test  [30] 

DF(0,  l,yj)=  max  inaxS;‘(0, 1). 

0<r<n-j,  »i 

There  is  a  delay  of  at  least  q\  samples  before  detection 
so  that  01  can  be  estimated.  If  0o  is  also  unknown  the 
test  includes  one  more  maximization. 

5.3.1  Unknown  Change  in  Mean 

As  an  example  of  this  problem,  suppose  p,-  are  both 
Gaussian  densities  with  0o  =  (poiVb)  the  known  mean 
and  variance  before  the  change.  For  simplicity  assume 
Po  =  0.  After  the  change  the  process  has  a  new  mean 
Pi  but  the  same  variance  Vq.  Then  Hoi  is 

Hoi(pi ,  k)  =  pT Vo~^y(*')  -  iVVi • 

The  maximization  over  pi  yields  the  usual  sample  mean 
estimate  for  pi  from  r  to  n 

'  '  ksr 

and  the  decision  function  becomes 

n 

DF{0,l,y2)  =  max  VHoi(pi(r,n),fc) 

re[0,n-l]|^ 

(n-r-t-1)  .7.  ^ 

-  - o - t^i(r,n)Vo  ^Pi(r,n). 

r€(0,n-lj  Z 


Again,  the  maximization  is  limited  to  a  moving  window 
of  size  L  since  it  must  be  solved  by  exhaustive  search. 

Under  hypothesis  0,  DF  is  a  central  chi-squared 
random  variable  with  p/2  degrees-of-freedom  DF  ~ 
X^(p/2)  which  has  mean  p/2.  Therefore,  the  thresh¬ 
old  T  must  be  at  least  p/2.  Since  under  hypothesis  1, 
DF  is  noncentral  chi-squared  with  p  freedoms  and  non¬ 
centrality  parameter  the  detection  time  can 

be  approximated  by  tne  results  in  section  5.2.2. 

The  test  computation  can  be  computed  in  order  Lm, 
or  in  order  m  steps  by  a  vector  processor  of  size  L.  At 
time  n  the  value  of  pi(r,  n)  is  stored  for  every  point  in 
the  window  in  the  array  Af(n).  Each  column  of  Af  is  the 
estimate  of  pi  involving  the  final  j  measurements.  After 
receiving  y{n  -f  1),  all  of  the  columns  of  M  are  shifted 
right  by  1  and  then  each  column  j  is  updated  by 


n  -f  1)  =  Af*(i,  n)  +  (y(n  +  1)  -  M*(j,  n))/j 


where  Af*  indicates  the  shift  operation.  The  decision 
function  becomes 


DF{0, 1, Vo)  =  ^M{j, nfVo  ‘ Af(j, n) 

with  f  =  n  -  jmas  +  1. 


5.4  Multiple  Parameterized  Models 

In  the  most  general  case  each  hypothesis  Hp  in  the  col¬ 
lection  H  could  be  parameterized  by  a  different  number 
of  free  parameters  6p .  This  is  the  form  that  our  dynamic 
sensing  algorithm  takes.  The  decision  function  is  still 
the  likelihood  ratio  but  now  with  a  maximization  over 
three  different  parameter  sets 


■DF(p,9,yJ)  =  maxinin 


.  L{p,q,r,e'A) 

nm  max  log  — — - f-r — , 

#,  *  L{p,r,ep) 


where  6'^,  is  the  value  of  the  parameters  for  Hp  for  the 
data  from  0  to  r  -  1.  Again,  the  maximization  over  r 
is  explicit  and  so  only  a  maximum  number  of  moving 
windows  are  kept. 

Two  additional  issues  arise  in  this  problem.  First, 
models  with  more  free  parameters  have  greater  explana¬ 
tory  power.  That  is  they  are  able  to  fit  finite  length 
noise  signals  better  than  models  with  fewer  free  param¬ 
eters.  Therefore,  a  method  of  penalizing  the  number  of 
free  parameters  which  corrects  this  problem  is  required. 
The  next  subsection  discusses  this  issue. 

Second,  the  autoregressive  model  hypothesis,  hypoth¬ 
esis  5,  does  not  use  a  fixed  number  of  parameters,  and 
the  parameter  estimates  depend  upon  the  order  of  the 
model.  Therefore,  an  efficient  computationally  technique 
for  estimating  the  model  parameters  for  all  model  orders 
is  needed.  This  is  provided  by  estimation  of  the  reflec¬ 
tion  coefficients  which  is  discussed  in  section  5.4.2. 

Finally,  section  5.4.3  gives  the  detailed  procedure  for 
our  algorithm.  This  includes  initialization,  choice  of 
weighting  for  the  number  of  parameters,  the  estimation 
procedures,  and  the  decision  procedure. 


5.4.1  Model  Penalty 

A  problem  arises  with  the  unadjusted  likelihood  ra¬ 
tio  test  when  the  test  involves  models  with  different 
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numbers  of  free  parameters.  Models  with  more  free 
parameters,  or  degrees-of-freedom,  have  more  explana¬ 
tory  power.  That  is  they  are  able  to  fit  finite  lengths  of 
data  better  than  models  with  fewer  degrees-of-freedom. 
Therefore,  the  model  with  the  greatest  number  of  param¬ 
eters  will  always  be  the  most  likely.  In  order  to  correct 
this  problem,  a  uniform  method  of  penalizing  the  extra 
freedoms  is  required.  Note  that  even  if  all  the  models 
had  the  same  number  of  freedoms,  the  change  hypoth¬ 
esis  has  twice  as  many  freedoms  as  the  recursive  model 
because  because  it  uses  two  models. 

This  problem  is  ubiquitous  to  likelihood  approaches 
which  involved  comparisons  between  models  with  differ¬ 
ent  numbers  of  freedoms.  This  problem  is  eliminated 
in  the  Bayesian  approach  by  the  a  priori  choice  of  prior 
probabilities  for  the  number  of  models  and  parameters. 
The  log  of  these  prior  probabilities  adds  to  the  likelihood 
calculation  and  essential  creates  an  a  priori  model  cost. 
What  the  likelihood  approach  requires  is  a  principle  for 
choosing  these  model  costs. 

The  minimum  description  length  principle  [22,  21],  or 
MDL,  is  one  of  the  few  general  principles  for  choosing 
the  model  cost.  This  principle  states  that  the  cost  of  the 
model  is  related  to  the  number  of  parameters  or  bits  it 
takes  to  encode  the  model.  In  general,  simpler  expla¬ 
nations  are  preferred,  therefore  the  likelihood  should  be 
reduced  by  the  complexity  of  coding  the  model.  For  a 
number  of  linear  parameter  estimation  problems,  R  isa- 
nen  has  shown  that  the  model  cost,  defined  in  terms  of 
description  length,  is  asymptotically  |logn  where  k  is 
the  number  of  parameters  and  n  is  the  number  of  sam¬ 
ples  used  in  the  estimation. 

An  equivalent  result  was  derived  by  Schwarz  [23]. 
Schwarz  derived  an  asymptotic  expansion  to  the  optimal 
Bayes  procedure  by  assuming  a  fixed  error  cost  and  fairly 
general  conditions  for  the  prior  distribution  and  the  sam¬ 
ple  distributions.  Under  his  assumption  the  Bayes  pro¬ 
cedure  chooses  the  a  posteriori  most  probable  model  and 
parameter  values.  Asymptotically  this  is  equivalent  to 
maximizing  the  log-likelihood  minus  ^logn.  For  auto¬ 
regressive  (AR)  processes  this  procedure  was  shown  to 
be  strongly  consistent  (asymptotically)  by  Hemerly  and 
Davis  [11]. 

No  such  result  exists  for  the  still  popular,  and  useful, 
alternative  AIC  procedure  derived  by  Akaike  [1].  Akaike 
suggested  that  the  decision  criterion  should  be  based 
on  maximizing  the  expected  value  of  the  log-likelihood. 
When  all  the  models  use  the  same  underlying  probabil¬ 
ity  distribution,  the  penalty  becomes  the  difference  in 
expected  log-likelihood  between  two  competing  models 
when  the  actual  signal  is  white  noise.  For  parameter  esti¬ 
mation  problems,  this  gives  a  penalty  equal  to  the  num¬ 
ber  of  free  parameters.  For  AR  processes,  Shibata  [24] 
derived  the  distribution  of  the  number  of  free  parame¬ 
ters  estimate,  k,  using  this  penalty.  The  result  shows 
that  the  most  probable  k  equals  k  but  that  the  expected 
value  of  lb  is  generally  closer  to  lb  -f  1. 

All  of  these  results  are  asymptotic  and  so  for  small 
sample  sizes,  with  which  all  our  tests  work,  there  is  some 
freedom  in  choosing  the  penzdty.  Because  of  the  stronger 
theoretical  justification  of  the  MDL  criterion,  our  algo- 


rithms  utilizes  a  penalty  of  the  form  b^log{n)  where  b 
is  a  unit  parameter  cost.  In  addition,  since  our  proce¬ 
dures  use  data  sizes  of  between  20  and  40  samples  for  the 
moving  window  the  MDL  and  AIC  criterion  give  nearly 
equivalent  penalties.  Values  of  6  larger  than  one  favor 
simpler  hypotheses  more  heavily  than  the  MDL  crite¬ 
rion.  We  obtained  subjectively  good  results  with  b  from 

2.2  to  2.7. 

5.4.2  Linear  Models  and  Orthogonal 
Estimation 

This  section  discusses  the  efficient  computation  of  the 
order  and  parameter  estimates  for  linear  predictor  mod¬ 
els  in  Gaussian  white  noise.  All  of  the  models  used  in  our 
dynamic  sensing  algorithm  fit  the  framework  discussed 
here. 

The  general  form  of  a  linear  predictor  model  is: 
y(n)  =  +  c(n) 

where  is  a  vector  of  regression  coefficients,  6  is 

the  parameter  vector,  and  e  is  a  white  Gaussian  noise 
process.  For  example,  for  the  new  mean  model  HI 
=  1.  For  the  AR  model  with  m  free  coefficients 

=  [l/(n  - 1) . y(»»  -  »n)]. 

For  the  linear  predictor  model  the  maximum  likeli¬ 
hood  estimate  of  the  parameters  is  the  least-squares  es¬ 
timate.  The  least-squares  estimate  can  be  written  in 
matrix  form  by  collecting  the  measurements  into  a  vec¬ 
tor  as  y(n)  =  v5*,  and  collecting  each  element  of 
into  a  vector  as  s,  (n)  =  V’lOi  ■  Then  the  least-squares 
estimate  takes  the  form 

Jij(n)  =  (s.(n)|sy(n)) 

Xi{n)  =  {s,(n)iy(n)) 

=  ^V'’’(0y(0 

t=i 

^(n)  = 

where  (|)  denotes  inner  product.  I  is  called  the  empirical 
information  matrix  and  X  is  called  the  empirical  infor¬ 
mation  vector.  9(n)  is  the  parameter  estimate  at  time 

R. 

The  parameter  estimates  depend  upon  the  model  or¬ 
der  because  in  general  the  vectors  {sj  (n)}  are  not  orthog¬ 
onal.  An  efficient  estimation  procedure  for  the  model 
order  can  be  performed  by  first  orthogonalizing  the  vec¬ 
tors  {s;(n)}  and  then  reconstructing  the  estimate  for 
any  desired  order.  The  orthogonal  parameter  estimates 
are  called  the  the  reflection  coefficients  because  they 
are  related  to  the  amount  of  energy  reflected  back  at 
a  changing  impedance  junction  in  an  electrical  transmis¬ 
sion  line  [19]. 

When  the  computation  is  performed  on-line  only  X(n) 
and  X(n)  are  actual  kept  in  memory.  Therefore,  the 
reflection  coefficients  need  to  be  computed  from  these 
matrices.  If  the  vectors  {8j(n)}  were  actually  available, 
a  Gram-Schmidt  procedure  could  be  applied  to  the  vec¬ 
tors  to  generate  an  orthogonal  basis.  This  decomposition 


represents  the  matrix  S  =  [s,]  as  5  =  where  Q 

is  the  orthonormal  basis,  is  the  diagonal  matrix  of 
lengths  for  the  orthogonal  vectors,  and  R  is  the  corre¬ 
lation  structure  of  S.  The  time  dependence  has  been 
dropped  for  clarity.  Then,  the  reflection  coefficients,  K, 
satisfy  =  Q'^Y. 

Now  note  that  I  =  DR,  since  Q  is  or¬ 
thonormal,  and  X  —  D^I^Q'^Y .  Therefore 

the  reflection  coefficients  are  generated  by  the  first  stage 
of  Gaussian  elimination.  Gaussi2m  elimination  factors  1 
into  DR.  Therefore,  the  reflection  coefficients  are  the 
solution  to  R"^ DK  =  X.  This  solution  is  just  the  result 
of  the  first  stage  of  Gaussian  elimination.  The  second 
step,  back  substitution,  solves  Rj9  =  K. 

Now  the  reflection  coefficients  can  be  used  to  recon¬ 
struct  the  solution  for  any  model  order.  Given  any  order 
model  m,  let  the  first  m  reflection  coefficients  be  A'm 
and  the  corresponding  submatrix  of  R  be  Rm-  Then  the 
original  model  coefficients  for  an  order  m  model  can  be 
determined  from  Km  =  RmOm- 

More  importantly,  the  reflection  coefficients  can  be 
used  to  immediately  determine  the  optimal  model  order. 
Let  Eo  =  (yiy)  be  the  original  signal  energy.  Then  be¬ 
cause  of  orthogonality,  the  energy  remaining  after  using 
the  reflection  coefficients  is  Em  =  Em-i~kf/Dm,m, 
and  the  adjusted  log-likelihood  of  this  model  is  l{Y,  m)  = 

— 0.5n(log(£m/n)  -b  1)  -  6^^  fogl”)-  The  model  order 
which  maximizes  l(Y,  m)  is  optimal  under  this  criterion. 

Although  the  computations  can  be  reorganized  for 
better  numerical  performance,  this  is  the  essence  of  the 
ladder  or  lattice  algorithm.  In  summary  the  estimation 
procedure  stores  J,  X,  and  the  energy  in  the  input  signal 
Eo  and  then  performs  the  following  update  procedure: 

1 .  Update  the  regression  vector  rl>  based  on  the  model 
equation. 

2.  Update  Eq. 

Eo*-  Eo  +  y(n)^y(n) 

3.  Update  the  information  vector  and  matrix. 

I  ^ 

X  ^  A'-l-^^y(n) 

4.  Decompose  I  into  2  =  R^ DR. 

5.  Solve  for  the  reflection  coefficients. 

R^DK  =  X 

6.  Determine  the  optimal  model  order  m*  by  maxi¬ 
mizing  the  adjusted  log-likelihood  /(m). 

E„  —  Em-l-kj/Dm,m 
l{m)  *-  -O.5n(log(£7„/n)  + l)-6^^^^5^^1og(n) 

7.  Solve  for  the  optimal  model  parameters. 

Om*  *  R  Km* 

When  the  number  of  model  parameters  is  fixed,  the  fast- 
gmn  algorithm  can  be  used  to  update  the  parameter  es¬ 
timates  with  a  fewer  number  of  computations  [18]  but 
the  essentials  are  the  same. 


Figure  15;  Generalized  Likelihood  segmentation  proce¬ 
dure  with  multiple  hypotheses.  A  single  recursive  model 
and  a  collect'  i  of  moving  models  must  be  updated  for 
every  new  ^asurement. 

5.4.3  Computational  Organisation 

The  sequential  nature  of  the  test  leads  to  an  organiza¬ 
tion  of  the  computations  that  minimizes  the  number  of 
computations.  A  conceptual  picture  of  the  computations 
is  shown  in  figure  15.  At  any  stage  in  the  computation 
there  is  one  hypothesis  being  grown  recursively  and  a 
collection  of  L  moving  window  models  for  every  alterna¬ 
tive  hypothesis.  A  minimum  length  Lmin  is  chosen  for 
the  moving  windows  based  on  estimation  requirements. 

Since  all  of  the  hypotheses  have  linear  parameters  in 
Gaussian  noise,  the  least  squares  estimate  of  the  param¬ 
eters  is  the  maximum  likelihood  estimate.  Therefore, 
for  every  window  w,  both  recursive  and  moving,  and  hy¬ 
pothesis  p  we  store  Dp(w)  which  contains:  1)  the  current 
value  of  the  empirical  information  matrix  Zp(w),  2)  the 
current  value  of  the  empirical  information  vector  A'p(u;), 
3)  the  current  value  of  the  parameters  9p(w),  and  4)  the 
current  value  of  the  adjusted  log-likelihood  lp{w). 

All  of  the  moving  windows  for  each  hypothesis  are  col¬ 
lected  into  a  structure  Adp(n)  =  {Dp(l),...,Dp(/moa:)}. 
For  window  lengths  less  than  lmin,  the  parameters  are 
not  esiimated.  When  a  new  data  point  y(n  -I- 1)  is  mea¬ 
sured  the  moving  windows  are  updated  by; 

1.  The  values  in  Mp  are  shifted  by  1  for  every  hypoth¬ 
esis. 

2.  Xp(w),  Jp(w),  Op(w),  and  /p(u;)  are  updated  for  ev¬ 
ery  window  using  the  algorithm  described  in  sec¬ 
tion  5.4.2. 

Let  the  current  hypothesis  hy  H*.  For  this  cur¬ 
rent  hypothesis  a  collection  of  recursive  windows  = 
{£>*(!), ...,  D*(lmax)}  is  stored.  In  this  case  each  D(i) 
structure  contains  the  information  vector  and  matrix 


the  sum  of  the  likelihood  for  all  possible  pairs  of  moving 
models  with  the  matching  recursive  model.  If  the  change 
likelihood  is  greater  a  change  is  detected. 

Several  steps  are  necessary  to  update  the  computation 
after  detecting  a  change.  First  the  change  time  is  taken 
as  the  new  time  origin.  Then,  the  number  of  moving 
models  is  adjusted  so  that  only  models  that  lie  entirely 
after  the  detected  change  time  are  kept.  Finally,  recur¬ 
sive  models  starting  at  time  r,  with  lengths  from  Lmin 
to  n  —  r  -(- 1  must  be  recomputed  in  a  batch  mode.  The 
computation  then  proceeds  from  this  adjusted  state. 

To  initialize  the  computation,  each  hypothesis  is  eval¬ 
uated  on  a  minimum  length  window  Lmin  in  a  batch 
mode.  The  optimal  hypothesis  is  computed  by  maximiz¬ 
ing  the  likelihoods  after  adjusting  for  a  model  cost.  The 
maximal  hypothesis  is  then  stored  as  a  single  recursive 
model. 

This  computation  produces  at  every  time  t,  lagged  by 
Lmax  =  Lmin  +  L  —  l,&  hypothesis  model,  its  associated 
parameter  vector,  an  estimate  of  y,  and  the  model’s  ad¬ 
justed  likelihood.  In  addition,  a  sequence  of  event  marks 
are  produced  as  each  change  in  model  is  detected.  If  the 
hypothesis  models  are  chosen  well,  the  output  of  the  al¬ 
gorithm  is  essential  a  sequence  of  symbols  for  the  data 
which  can  be  used  for  further  task  recognition  process¬ 
ing. 

6  Experiments  and  Results 

The  algorithms  developed  in  the  previous  section  were 
applied  to  segmenting  the  contact  signal  hypotheses  de¬ 
veloped  in  section  4  on  both  the  selected  data  samples 
and  more  general  examples.  In  addition,  the  algorithms 
were  simplified  and  specialized  to  particular  problems 
to  show  their  utility  in  detecting  important  but  minute 
changes  in  real-time.  The  results  should  be  evaluated  in 
terms  of  three  criteria:  1)  do  the  algorithms  divide  the 
signal  into  segments  that  make  sense,  2)  are  the  hypothe¬ 
ses  identified  by  this  context  free  algorithm  the  ones  we 
expect,  and  3)  could  the  results  be  interpreted  given  a 
context.  The  results  show  that  the  algorithms  do  a  good 
job  of  segmenting  the  signal,  but  that  a  context  free 
interpretation  is  very  difficult.  However,  by  fixing  the 
context  for  simple  cases,  robust  interpretations  should 
be  possible. 

6.1  Performance  on  design  set 

The  general  multi-model  generalized  likelihood  ratio  test 
was  applied  to  the  set  of  data  used  for  developing  the 
model.  There  were  six.  basic  models 


based  on  data  from  0  to  n— t-l-l.  When  a  new  data  point 
y(n  + 1)  is  measured  the  recursive  windows  are  updated 
by: 

1.  The  values  in  K  are  shifted  by  1. 

2.  The  values  in  Z7*(l)  are  updated  using  the  algo¬ 
rithm  described  in  section  5.4.2  except  now  the 
model  order  is  fixed. 

After  updating  ail  the  moving  windows  and  the  re¬ 
cursive  window,  the  algorithm  examines  the  results  to 
detect  a  change.  The  likelihood  of  the  current  recur¬ 
sive  model  Z7*(l)  plus  the  threshold  7^  is  compared  to 


Null  State  (HO):  j/(t)  ~  Vo) 

New  Mean  (HI):  y(t)  —  N(fi,  Vo) 

New  Noise  Level  (H2);  y(l)  ~  JV(fio,  Vi) 

New  Mean  and 

Noise  Level  (H3);  y(i)  -  lV(pi ,  Vj ) 

Impact  Signal  (H4);  y(f)  =  +  P  +  e(l) 

e(0  ~  ArfoTVi) 

Grasping  Signal  (H5):  y(l)  =  0;y(<  -  j)  -t-  e(t) 

e(<)~N(0,Vi). 

For  each  model,  moving  windows  with  15  to  40  sam¬ 
ples,  inclusively,  were  used.  The  number  of  free  parame- 


Io|(loglikelibood) 


Figure  16:  Impact  Signal  incorrectly  marked  at  tick  200. 


ters  was  penalized  using  the  MDL  criterion  with  6  =  2.2. 
In  addition,  impacts  cause  such  large  changes  in  variance 
that  changes  were  signaled  when  the  first  data  point  of 
an  impact  entered  a  moving  window.  Therefore,  a  per¬ 
sistence  test  of  20  was  necessary  to  get  the  correct  mark¬ 
ing  of  the  change  time.  The  length  of  minimum  window 
was  chosen  based  on  the  length  required  for  estimation, 
while  the  maximum  was  chosen  based  on  the  size  of  the 
persistence  test.  The  decision  threshold  was  set  at  12. 

6.2  Performance  on  task  examples 
6.2.1  Impacts 

The  training  set  for  impacts  was  used  to  get  a  mea¬ 
surement  of  performance.  The  classification  for  72  ex¬ 
amples  was: 
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The  results  show  that  impact  hypotheses  (H4)  and  the 
change  in  variance  hypothesis  (HI)  are  difficult  to  distin¬ 
guish  and  suggests  the  need  to  consider  context.  Selec¬ 
tion  of  H4  depends  upon  this  signal  having  some  consis¬ 
tency  from  sample  to  sample.  This  is  only  occasionally 
the  case,  and  so  the  simpler  explanation  HI  is  frequently 
chosen.  In  real  world  impact  situations,  this  problem  re¬ 
sulted  in  confusion  of  the  impact  model  (H4)  with  the 
change  in  new  mean  and  noise  model  (H3)  and  the  grasp¬ 
ing  signal  model  (H5).  In  the  case  of  H5,  a  set  of  high 
frequency  poles  was  estimated. 

However,  the  introduction  of  context  should  make  dis¬ 
tinguishing  the  hypotheses  much  easier.  An  example  of  a 
typical  unsuccessful  marking  is  shown  in  figure  16.  The 
top  plot  shows  the  signal,  the  next  plot  shows  the  se¬ 
quence  of  hypotheses  and  the  transition  marks,  the  third 
plot  shows  the  variance  estimate,  and  the  bottom  plot 


Figure  17:  Impact  Signal  correctly  marked. 


shows  the  log  of  the  loglikelihood  for  the  top  two  hy¬ 
potheses.  Estimates  of  the  coefficients  for. each  of  the 
models  is  also  produced  by  the  procedure,  but  is  not 
shown.  The  figure  shows  than  in  unsuccessful  marking 
the  variance  estimate  has  a  sudden  increase  followed  by 
a  sudden  decrease.  It  also  shows  that  the  likelihood  indi¬ 
cates  an  event  very  clearly,  at  the  impact  time,  and  that 
the  two  top  hypotheses,  which  includes  the  impact  hy¬ 
pothesis,  are  very  close  during  the  impact.  It  should  be 
possible  to  use  this  additional  information  to  add  con¬ 
text  and  get  more  accurate  markings.  Finally,  the  resid¬ 
ual  vibration  after  the  impact  is  matched  to  hypothesis 
3  because  of  the  change  in  mean  caused  by  the  hammer 
resting  on  the  sensor.  A  successful  marking  is  shown  in 
figure  17.  The  impact  signal  is  modeled  as  H4  and  the 
residual  after  the  impact  is  modeled  with  H5.  Again,  HI 
and  H4  are  close  during  the  impact  event. 

The  transition  marks  are  generally  close  to  the  actual 
transition.  The  transition  time  error  for  the  training  set 
was: 


Distance  to  transition 

0  1  2  3  >3 

48  12  2  1  9 

The  transition  times  are  within  three  87%  of  the  time. 
Errors  larger  than  3  were  caused  by  small  transitions  in 
the  signal  that  occurred  just  before  the  impact.  In  this 
case,  the  persistence  test  prevents  a  second  change  at 
the  impact  location  and  thus  causes  a  large  error. 

The  performance  on  the  training  examples  indicate 
that  impacts  are  very  difficult  to  detect  in  a  context  free 
form.  They  are  easily  confused  with  jumps  in  the  driv¬ 
ing  variance  for  the  other  models.  This  is  confirmed 
in  a  more  general  impact  experiments.  To  get  a  sense 
of  rerfotmance  for  general  manipulation,  the  fingertip 
wao  held  by  hand  and  then  struck  against  an  aluminum 
block.  One  hundred  and  three  examples  were  taken,  and 
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the  impact  point  was  selected  by  looking  for  the  first  out¬ 
lier  that  was  20  or  more  standard  deviations  away  from 
the  initial  signal.  The  initial  part  of  each  of  these  sig¬ 
nals  is  much  more  variable  and  the  impacts  are  usually 
followed  by  a  large  change  in  mean  from  the  sustained 
contact.  For  these  examples  the  classification  was: 
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This  cleuly  shows  that  impacts  are  confused  with  H3 
and  HI.  However,  the  segmentation  times  are  still  very 
good  with  93%  of  the  times  falling  within  3  of  the  correct 
time. 
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Even  in  the  case  of  an  error,  the  variance  shows  a  char¬ 
acteristic  increase  and  then  decrease  over  time.  This  can 
be  used  to  improve  labeling  at  a  context  level. 

6.2.2  Changes  in  texture 

Several  basic  change  in  texture  experiments  were  per¬ 
formed.  The  first  used  the  slip  apparatus  to  move  the 
sensor  over  the  aluminum  bottom  and  then  over  a  piece 
of  180  grit  sandpaper.  An  example  segmentation  from 
this  experiment  is  shown  in  figure  18.  In  all  of  these  ex¬ 
periments,  the  high  vibration  section  was  matched  with 
H5.  The  moment  of  impact  with  the  sandpaper  is  gener¬ 
ally  indicated  by  a  sudden  rise  in  variance.  In  this  par¬ 
ticular  plot,  the  contact  occurs  at  tick  1240.  It  was  not 
marked  because  the  likelihood  did  not  pass  the  persis- 
lonce  test,  however  the  change  in  variance  was  detected. 

In  all  the  plots  at  least  two  levels  of  vibration,  as  indi¬ 
cated  by  the  variance  estimate,  are  always  marked:  one 
for  sliding  on  the  aluminum,  and  one  for  the  sandpa¬ 
per.  In  figure  18  this  can  be  clearly  seen  in  the  variance 
estimate  at  tick  500  and  1500. 

The  importance  of  this  result,  and  the  other  texture 
experiments  is  that  substantial  (over  500  samples)  sec¬ 
tions  of  each  signal  are  marked  as  statistically  similar. 
This  is  a  huge  reduction  in  the  size  of  the  input  to  a 
higher  level  recognizer.  The  algorithms  reduce  large 
pieces  of  the  raw  signal  into  a  single  variance  nnd  set 
of  estimation  parameters. 

This  can  be  clearly  seen  in  two  additional  texture  ex¬ 
periments.  In  each  of  these  experiments  the  sensor  was 
moved  by  hand  in  order  to  get  smooth,  low  vibration, 
motion.  In  the  first  experiment,  the  fingertip  sensor  was 
pulled  over  a  smooth  plastic  surface  and  then  over  4 
sheets  of  ordinary  paper  (figure  19).  In  the  second,  the 
fingertip  was  pulled  across  a  calibrated  surface  roughness 
mrtde  with  a  lathe  (figure  20). 

In  figure  19  the  transition  to  sliding  over  the  paper 
is  marked  at  tick  200.  The  algorithm  detects  both  a 
change  in  the  variance,  which  is  caused  by  the  difference 
in  roughness  between  the  two  surfaces,  and  a  change  in 
the  spectrum  of  the  signal.  In  figure  20  the  repetitive 
pattern  of  the  lathed  surface  texture  can  be  seen  in  the 
signal.  The  algorithm  models  this  with  the  autoregres¬ 
sive  coefficients.  The  algorithm  detects  the  jump  in  the 
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Figure  18:  Example  of  change  in  texture:  Sliding  over 
an  aluminum  surface  followed  by  sliding  over  180  grit 
sandpaper. 
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Figure  19:  Example  of  change  in  texture:  Section  of 
sliding  over  a  plastic  table-top  followed  by  four  sheets 
of  paper.  The  mark  is  placed  at  the  point  of  change 
between  the  two  surfaces.  In  addition  to  a  change  in 
variance,  the  algorithm  detects  a  change  in  the  spectral 
.  t  of  the  signal  as  indicated  by  the  AR  coefficients. 
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Figure  20:  Example  of  change  in  texture:  Motion  across 
a  calibration  surface  roughness  calibration  surface.  The 
fingertip  begins  sliding  over  the  surface  at  approximately 
tick  180.  The  surface  was  made  with  a  lathe  and  has  half- 
sinusoidal  features  of  height  0.002  in  and  width  0.030  in. 


spectrum  when  the  fingertip  begins  to  move  across  the 
textured  surface.  If  the  velocity  of  the  sensor  could  be 
controlled,  or  modelled,  the  autoregressive  coefficients 
could  be  potential  used  for  identifying  or  sorting  tex¬ 
tures. 


6.3  Summary 

The  results  show  that  the  algorithm  tests  for  char¬ 
acteristics  that  changed  when  the  contact  conditions 
changed.  The  technique  produced  generally  accurate  es¬ 
timates  of  the  change  times,  and  was  sensitive  to  small 
variations  in  the  contact  conditions.  Our  experiments 
showed  that  changes  between  small  texture  variations 
were  detectable  when  the  sensor  was  moved  by  hand.  It 
was  less  sensitive  under  computer  control,  possibly  be¬ 
cause  of  the  additional  robot  vibration.  The  results  show 
that  the  algorithm  provides  a  method  of  discrimination 
between  different  contact  conditions. 

Identification,  or  labeling,  of  the  correct  signal  source 
was  problematic.  This  paper  discussed  work  aimed  at 
characterizing  the  statistical  source  of  the  data,  and  then 
using  that  source  as  a  label.  The  autoregressive  hypoth¬ 
esis  H5  was  often  selected  as  the  most  probable  even  in 
the  developmental  data  sets.  In  more  general  manipu¬ 
lation,  where  some  low  frequency  interaction  force  will 
always  be  present,  this  hypothesis  will  become  even  more 
probable.  Therefore,  future  work  will  explore  using  only 
the  AR  signal  model 
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as  a  basis  for  segmentation.  This  model  encompasses 
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almost  all  of  the  other  models  and  therefore  little  seg¬ 
mentation  power  will  be  lost. 

In  addition,  this  model  produces  segments  by  fitting 
a  spectrum  to  the  local  data.  Therefore,  all  of  the  data 
in  a  segment  will  have  approximately  the  same  spectral 
content.  Preliminary  work  shows  that  it  may  be  possible 
to  use  the  AR  coefficients  for  texture  sorting.  The  same 
texture  tended  to  produce  an  equivalent  spectrum  when 
the  sensor  was  moved  at  the  same  velocity.  Additional 
work  in  adjusting  for  sensor  velocity  and  characterizing 
the  performance  of  this  technique  remains  to  be  done. 

In  addition,  a  task  level  explanation  of  the  results  re¬ 
quires  a  task  model  or  task  context  which  was  delib¬ 
erately  not  introduced  into  this  study.  Future  work  is 
aimed  at  examining  this  problem.  A  context  can  provide 
both  a  interpretation  framework,  and  a  guide  for  placing 
better  segmentation  boundaries.  The  framework  may 
depend  upon  position  measurements  and  should  predict 
the  expected  signal  parameters. 

7  Conclusion 

This  paper  examined  the  problem  of  temporal  segmen¬ 
tation  of  manipulation  force  signals.  Substantially  more 
information,  useful  for  manipulation,  then  contact/no 
contact  is  carried  by  the  contact  force  signals.  A  better 
understanding  of  the  signal  characteristics  and  their  rela¬ 
tionship  to  task  models  would  provide  a  rich  framework 
for  controlling  manipulation. 

The  first  problem  in  designing  a  system  to  interpret 
the  force  signals,  is  to  segment  the  signal  into  statisti¬ 
cally  equivalent  classes.  The  sequential  hypothesis  test¬ 
ing  method  was  introduced  as  a  general  tool  for  produc¬ 
ing  segment  boundaries,  and  was  applied  to  the  problem 
of  segmenting  the  individual  strains  from  a  6-axis  force 
torque  sensor. 

Experiments  with  isolated  simple  examples  was  used 
to  develop  a  set  of  statistical  source  hypotheses  for  the 
data.  These  experiments  resulted  in  the  development 
of  six  basic  statistical  models.  Impact  signals  were  the 
most  difficult  signal  to  model  and  detect.  A  great  many 
procedures  and  models  were  attempted  before  settling 
on  a  training  data  based  model  and  the  GLR  test. 

The  GLR  test  produced  accurate  segmentation 
boundaries  and  event  time  marks.  In  addition,  the  re¬ 
sults  showed  that  an  autoregressive  model  was  often  the 
most  power  hypothesis.  We  expect  that  this  model  will 
be  sufficient,  by  itself,  for  producing  good  segmentations 
of  the  signal. 

In  addition,  the  framework  produced  some  simple 
tests  which  can  be  used  in  place  of  threshold  procedures 
and  which  have  better  performance  and  theoretical  jus¬ 
tification  with  no  additional  computational  cost.  An  ap¬ 
propriate  filter  followed  by  the  Page-Hinkley  test  can  be 
used  to  test  for  many  simple  conditions.  In  some  prelim¬ 
inary  experiments,  we  were  able  to  detect  the  onset  of 
fingertip  slip,  by  feeding  the  absolute  value  of  a  highpass 
signal  into  the  Page-Hinkley  test. 

One  of  the  most  important  design  lessons  is  that  the 
high  frequency  component  of  the  force  signal  carries  es¬ 
sential  information.  During  manipulation,  the  robot  fin¬ 
gers  will  exert  time  varying  lowpass  forces.  Therefore, 


Level  4: 

Associating  Dynamics  with  Objects 
in  tbe  Environment 

Level  3: 

Characterizing  Dynamics  of  Interaction 

Level  2: 

Association  of  Signal  Statistics 
with  Regions  of  Space 

Level  1: 

Network  of  Expected  Signal  Statistics 

Level  0: 

Characterizing  Sensor  Signal  Statistics 

Table  1:  Hierarchy  of  haptic  explanations. 


any  unexpected  events,  like  impact  or  slip  must  be  de¬ 
tected  in  the  high  frequency  range.  Lowpass  filtering 
the  sensors  to  eliminate  this  "noise”  removes  a  major 
information  channel. 

Soft  covers,  used  for  better  gripping,  for  the  aluminum 
fingertips  have  the  effect  of  lowpass  filtering  the  contact 
signals.  Therefore,  a  two  part  sensor  that  uses  a  high  fre¬ 
quency  sensor  embedded  in  the  cover,  and  the  fingertip 
sensor  for  low  frequency  signals  needs  to  be  developed. 
The  high  frequency  sensory  can  be  coarsely  distributed 
and  have  poor  directional  sensitivity.  It’s  only  the  vibra¬ 
tion  magnitude  that  is  important. 

Sensors  should  also  be  designed  with  critical  damping 
properties.  The  fingertip  sensor  is  lightly  damped  and 
rings  when  excited  at  high  frequencies.  Light  damping 
makes  modeling  the  force  signal  source  more  difficult,  be¬ 
cause  near  the  natural  frequency  the  signal  is  a  reflection 
of  the  sensor’s  dynamics  and  not  the  input  signal. 

Although  the  full  algorithm  implemented  in  this  paper 
does  not  run  in  real-time,  we  feel  that  a  less  complete  set 
of  tests  based  purely  on  the  AR  models  could  be  made 
to  run  in  real-time  on  a  DSP  processor.  The  steps  in 
the  algorithm  are  highly  vectorized  and  could  be  run  in 
parallel.  We  are  investigating  these  possibilities. 

Additional  discrimination  is  possible  by  examining  the 
spatio-temporal  distribution  of  the  contact  signal.  Local 
slip  could  be  distinguished  from  an  overall  vibration  by 
examining  this  distribution.  However,  this  requires  the 
development  of  new  array  sensor  technologies. 

The  procedures  reported  on  in  this  paper  provide  a 
powerful  front-end  to  an  algorithm  for  interpreting  the 
state  of  contact.  Such  an  interpretation  will  require  task 
context.  This  can  be  provided  with  several  approaches 
each  with  different  levels  of  explanatory  power.  We  en¬ 
vision  the  hierarchy  of  haptic  explanations  shown  in  ta¬ 
ble  1 .  This  sequence  of  levels  of  interpretation  provides 
the  basis  for  our  long-term  research  agenda  in  dynamic 
haptic  sensing.  Our  current  research  is  aimed  at  inves¬ 
tigating  levels  1  and  2  to  provide  a  contextual  basis  for 
interpreting  the  sensor  signal  statistics,  and  the  bottom 
level  was  investigated  in  this  paper. 

Level  1  is  the  lowest  level  of  context.  In  this  level  an 
expected  sequence  or  network  of  expected  hypotheses  is 
associated  with  a  task.  For  example  in  a  change  of  tex¬ 
ture  task,  the  algorithm  would  be  told  that  two  textures 
are  present  and  that  it  is  expected  that  a  transition  from 
one  to  the  other  will  occur.  Each  texture  would  be  char¬ 
acterized  by  the  parameters  that  the  level  0  algorithm  is 
expected  to  estimate  for  each  region. 


The  next  level  of  complexity  is  to  associate  the  ex¬ 
pected  hypotheses  and  their  parameters  with  a  region 
of  the  workspace.  This  approach  again  builds  a  graph, 
but  now  the  graph  is  indexed  by  regions  of  space.  This 
approach  requires  measurements  of  position. 

With  the  measurement  of  position,  a  different  graph 
can  be  constructed  (level  3).  This  graph  would  predict 
relationships  between  force  signtUs  and  the  position  sig¬ 
nals.  The  elements  in  the  graph  become  the  possible  dy¬ 
namic  relationships  between  position  and  contact  force. 
This  would  require  knowledge  of  the  possible  physics 
available  in  the  task. 

Finally,  at  the  highest  level  of  context  the  dynamics 
and  the  signals  are  associated  with  objects  in  the  en¬ 
vironment.  This  would  require  approximate  knowledge 
about  tbe  shape  and  associated  physics  for  the  objects 
that  the  robot  might  interact  with.  This  could  either  be 
programmed  in  or  be  acquired  through  sensing  with  a 
combination  of  haptic  sensors  and  vision. 

This  paper  addressed  the  problem  of  dynamic  con¬ 
tact  sensing.  Dynamic  models  of  different  basic  con¬ 
tact  events  were  developed  and  used  to  derive  a  sta¬ 
tistical  segmentation  algorithm  based  on  the  general¬ 
ized  likelihood  ratio  teat.  This  test  provides  a  power¬ 
ful  model-based  framework  for  developing  segmentation 
algorithms.  The  procedure  was  shown  to  be  capable  of 
segmenting  a  number  of  useful  signals  into. statistically 
equivalent  pieces.  This  resulted  in  a  huge  reduction  in 
the  amount  of  data  that  has  to  be  given  to  a  higher  level 
recognizer.  Finally,  we  presented  a  framework  for  future 
study  of  higher  level  recognizers  and  discussed  the  chal¬ 
lenges  presented  by  our  approach  to  perception  based 
robot  programming. 
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